Effectively Building Tera Scale MaxEnt Language Models Incorporating Non-Linguistic Signals
Authors
Abstract
Maximum Entropy (MaxEnt) language models are powerful models that can incorporate linguistic and non-linguistic contextual signals in a unified framework with a convex loss. MaxEnt models also have the advantage of scaling to large model and training data sizes. We present the following two contributions to MaxEnt training: (1) by leveraging smaller amounts of transcribed data, we demonstrate that a MaxEnt LM trained on various types of corpora can be easily adapted to better match the test distribution of Automatic Speech Recognition (ASR); (2) we present a novel adaptive-training approach that efficiently models multiple types of non-linguistic features in a universal model. We evaluate the impact of these approaches on Google’s state-of-the-art ASR for the task of voice-search transcription and dictation. Training 10B-parameter models on a corpus of up to 1T words, we show large reductions in word error rate from adaptation across multiple languages. In addition, human evaluations show significant improvements on a wide range of domains from using non-linguistic features. For example, adapting to geographical domains (e.g., US states and cities) affects about 4% of test utterances, with a 2:1 win-to-loss ratio.
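Since the abstract describes the model family only in compact terms, the following Python sketch shows a tiny MaxEnt LM whose feature set crosses n-grams with a geographic signal, the kind of non-linguistic feature the paper adapts to. Everything in it (the hashed feature space NUM_BUCKETS, the feature templates, plain unregularized SGD, and the toy geo values) is an illustrative assumption, not the paper's actual 10B-parameter training pipeline.

import math
from collections import defaultdict

NUM_BUCKETS = 2 ** 20  # hashed feature space; production models reach ~10B parameters

def features(history, word, geo=None):
    """Hash n-gram features into integer ids; when a non-linguistic
    signal is present, also emit (geo x n-gram) cross features."""
    ids = []
    for order in range(len(history) + 1):
        ngram = tuple(history[len(history) - order:]) + (word,)
        ids.append(hash(("ngram", ngram)) % NUM_BUCKETS)
        if geo is not None:
            ids.append(hash(("geo", geo, ngram)) % NUM_BUCKETS)
    return ids

def prob(weights, history, word, vocab, geo=None):
    """p(word | history, geo) under the log-linear (MaxEnt) model."""
    scores = {v: sum(weights[f] for f in features(history, v, geo)) for v in vocab}
    z = sum(math.exp(s) for s in scores.values())
    return math.exp(scores[word]) / z

def sgd_step(weights, history, word, vocab, geo=None, lr=0.1):
    """One SGD step on the convex negative log-likelihood."""
    probs = {v: prob(weights, history, v, vocab, geo) for v in vocab}
    for v in vocab:
        grad = probs[v] - (1.0 if v == word else 0.0)
        for f in features(history, v, geo):
            weights[f] -= lr * grad

weights = defaultdict(float)
vocab = ["pizza", "tacos", "chowder"]
for _ in range(50):
    sgd_step(weights, ["best"], "chowder", vocab, geo="boston")
    sgd_step(weights, ["best"], "tacos", vocab, geo="austin")
print(round(prob(weights, ["best"], "chowder", vocab, geo="boston"), 3))  # high
print(round(prob(weights, ["best"], "chowder", vocab, geo="austin"), 3))  # low

The point of the geo x n-gram crosses is that a single universal model can prefer different completions per region, which is roughly what the geographical-adaptation human evaluation in the abstract measures.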
Similar resources
Incorporating Cognitive Linguistic Insights into Classrooms: the Case of Iranian Learners’ Acquisition of If-Clauses
Cognitive linguistics offers the most inclusive and consistent description to date of how language is organized, used, and learned, and it contains a great number of concepts that are useful to second language learners. If-clauses in English, on the other hand, remain a persistent struggle for foreign language learners due to their intrinsic intricacies. EFL grammar books are ...
Feature-rich sub-lexical language models using a maximum entropy approach for German LVCSR
German is a morphologically rich language with a high degree of word inflection, derivation, and compounding. This leads to high out-of-vocabulary (OOV) rates and poor language model (LM) probabilities in large vocabulary continuous speech recognition (LVCSR) systems. One of the main challenges in German LVCSR is the recognition of OOV words. For this purpose, data-driven morphem...
Life-wise Language Learning Textbooks: Construction and Validation of an Emotional Abilities Scale through Rasch Modeling
Underlying the recently developed notions of applied ELT and life syllabus is the idea that language classes should give precedence to learners’ life qualities, for instance emotional intelligence (EI), over and above their language skills. In so doing, ELT is ascribed an autonomous status, and ELT classes can devote their full potential to learners. With that in mind, this study aimed to d...
Incorporating Linguistic Knowledge for Learning Distributed Word Representations
Combined with neural language models, distributed word representations achieve significant advantages in computational linguistics and text mining. Most existing models estimate distributed word vectors from large-scale data in an unsupervised fashion but do not take rich linguistic knowledge into consideration. Linguistic knowledge can be represented as either link-based knowledge...
Backoff inspired features for maximum entropy language models
Maximum Entropy (MaxEnt) language models [1, 2] are linear models that are typically regularized via well-known L1 or L2 terms in the likelihood objective, hence avoiding the need for the kinds of backoff or mixture weights used in smoothed n-gram language models using Katz backoff [3] and similar techniques. Even though a backoff cost is not required to regularize the model, we investigate the us...
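For reference, the L1/L2-regularized MaxEnt objective this abstract alludes to is standardly written as follows, where \phi(h, w) is the feature vector for history h and predicted word w, and \lambda_1, \lambda_2 are the regularization strengths:

\mathcal{L}(\theta) = -\sum_{(h,w)} \log \frac{\exp\left(\theta^{\top}\phi(h,w)\right)}{\sum_{w' \in V} \exp\left(\theta^{\top}\phi(h,w')\right)} + \lambda_1 \lVert\theta\rVert_1 + \frac{\lambda_2}{2} \lVert\theta\rVert_2^2

Because this objective is convex in \theta, the regularizers alone suffice to control overfitting, which is why explicit backoff weights are optional in this family of models.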